Promote dev to main: fix(triggers) respond-to-ci on mixed-state SHA #1242
Merged
zbigniewsobiecki merged 1 commit into main on Apr 30, 2026
…uite-success (#1241)

GitHub fires `check_suite.completed` once per workflow. When workflow A's suite fails fast (e.g. the E2B template-rebuild on ucho/PR#176, 38s) and workflow B's suite is still running, the failure handler correctly defers with "not all complete yet". When workflow B's suite eventually completes with `conclusion=success`, only the success handler fires, and it unconditionally dispatches `review`. That review is silently skipped at worker time (`pollWaitForChecks` sees `allPassing=false`), and no later event with `conclusion=failure` ever wakes the failure handler back up. Net: `respond-to-ci` is permanently lost for that SHA.

Fix: in the success handler, after the author + base gates, query the aggregate `getCheckSuiteStatus` and fork. When `allComplete && anyFailed`, dispatch `respond-to-ci` (gated by its own trigger config + attempt limit) instead of `review`. Otherwise, keep the current behavior: dispatch `review` with `waitForChecks: true` so the worker polls if checks are still in progress.

Single-source the dispatch envelope: extract `dispatchRespondToCi` plus `fixAttempts` / `MAX_ATTEMPTS` / `resetFixAttempts` into a new shared module `respond-to-ci-dispatch.ts`. Both handlers converge on it. The failure handler keeps its early `gateTriggerEnabled` so disabled projects don't burn GitHub API calls; the helper re-checks the same gate so the success-handler fork is also guarded.

Tests: 5 new TDD cases on the success handler: failure conclusion, timed_out conclusion, in-progress checks (defers to worker), all-passing (review), and the "already-reviewed at HEAD with CI failure" case (CI failure trumps prior approval). Adjusted the pre-existing "getCheckSuiteStatus not called" assertions to match the new flow: removed where the call now happens (post-base-gate), kept where the gate short-circuits before it.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
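The fork and the shared attempt-limited dispatch described in the commit message could look roughly like the sketch below. It is an illustration under assumptions, not the code from `respond-to-ci-dispatch.ts`: the `CheckSuiteStatus` shape, the `chooseDispatch` name, the string key for `fixAttempts`, the value of `MAX_ATTEMPTS`, and the `triggerEnabled` parameter are all hypothetical; only the identifiers `dispatchRespondToCi`, `fixAttempts`, `MAX_ATTEMPTS`, `resetFixAttempts`, `allComplete`, `anyFailed` and `waitForChecks` come from the commit message.

```typescript
// Hypothetical shape of the aggregate status; the real return type of
// getCheckSuiteStatus is not shown in this PR.
interface CheckSuiteStatus {
  allComplete: boolean;
  anyFailed: boolean;
}

type Dispatch =
  | { agent: "respond-to-ci" }
  | { agent: "review"; waitForChecks: true };

// Fork taken by the success handler once the author + base gates have passed.
// chooseDispatch is an illustrative name, not one from the PR.
function chooseDispatch(status: CheckSuiteStatus): Dispatch {
  if (status.allComplete && status.anyFailed) {
    // Every suite has finished and at least one failed: CI needs a fix.
    return { agent: "respond-to-ci" };
  }
  // Checks still running or all green: review, letting the worker poll.
  return { agent: "review", waitForChecks: true };
}

// Assumed limit; the PR does not state the real value.
const MAX_ATTEMPTS = 3;

// Attempts per key (assumed here to be a plain string such as a PR id).
const fixAttempts: Record<string, number> = {};

// Shared helper: re-checks the trigger gate and the attempt limit so both
// the failure handler and the success-handler fork are guarded the same way.
function dispatchRespondToCi(
  key: string,
  triggerEnabled: boolean
): "dispatched" | "skipped" {
  if (!triggerEnabled) return "skipped";
  const attempts = fixAttempts[key] || 0;
  if (attempts >= MAX_ATTEMPTS) return "skipped";
  fixAttempts[key] = attempts + 1;
  return "dispatched";
}

function resetFixAttempts(key: string): void {
  delete fixAttempts[key];
}
```

Keeping the decision in a pure function makes the five cases listed under Tests straightforward to cover without mocking the GitHub API; the gate and the counter live in one place so neither handler can bypass them.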
Promotes 1 commit from dev.

What changed

Closes the long-standing bug where a fast-failing sibling check_suite (e.g. E2B Template Rebuild) on a PR head SHA caused `respond-to-ci` to be permanently lost: the failure event arrived before the rest of CI completed (correctly deferring), and the eventual success event only ever dispatched `review`. The success handler now queries aggregate state and forks: if all checks on the SHA have completed and any of them failed, it dispatches `respond-to-ci` instead of `review`.

Live incident: ucho PR #176 on 2026-04-30. Full root-cause and fix history in #1241.

Risk

Low. Pure trigger-handler change, fully unit-tested. Behavior change is additive: the new code path only fires in an existing situation (success-event-after-mixed-failure) that currently silently drops a `respond-to-ci` dispatch. All existing happy paths are preserved.

CI on dev

CI (lint + test 1-4 + Docker validate)
Build and Deploy (Dev)
Push on dev

🤖 Generated with Claude Code
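The race described under What changed can be replayed with a small simulation. This is a simplified model, not the project's handlers: `replay`, `aggregate`, `SuiteEvent` and the workflow names `A` and `B` are hypothetical, only three conclusions are modelled, and the author, base, trigger-config and attempt-limit gates are left out.

```typescript
type Conclusion = "success" | "failure" | "timed_out";

interface SuiteEvent {
  workflow: string;
  conclusion: Conclusion;
}

// Per-workflow conclusion for one SHA; undefined means "still running".
type ShaState = Record<string, Conclusion | undefined>;

function aggregate(state: ShaState): { allComplete: boolean; anyFailed: boolean } {
  const conclusions = Object.keys(state).map((w) => state[w]);
  return {
    allComplete: conclusions.every((c) => c !== undefined),
    anyFailed: conclusions.some((c) => c !== undefined && c !== "success"),
  };
}

// Replays check_suite.completed events in order and records which agent each
// event dispatches. `fixed` toggles the success-handler fork on or off.
function replay(events: SuiteEvent[], workflows: string[], fixed: boolean): string[] {
  const state: ShaState = {};
  for (const w of workflows) state[w] = undefined;

  const dispatched: string[] = [];
  for (const e of events) {
    state[e.workflow] = e.conclusion;
    const { allComplete, anyFailed } = aggregate(state);

    if (e.conclusion === "success") {
      // Success handler: before the fix it always dispatched review.
      if (fixed && allComplete && anyFailed) dispatched.push("respond-to-ci");
      else dispatched.push("review");
    } else if (allComplete) {
      // Failure handler: only acts once every suite has completed.
      dispatched.push("respond-to-ci");
    }
    // Otherwise the failure handler defers ("not all complete yet").
  }
  return dispatched;
}

// Workflow A fails fast, workflow B succeeds later.
const mixed: SuiteEvent[] = [
  { workflow: "A", conclusion: "failure" },
  { workflow: "B", conclusion: "success" },
];

console.log(replay(mixed, ["A", "B"], false)); // [ 'review' ]
console.log(replay(mixed, ["A", "B"], true)); // [ 'respond-to-ci' ]
```

With the fork disabled, the only dispatch for the SHA is a `review` that the worker later skips, which matches the lost `respond-to-ci` in the incident; with the fork enabled, the late success event dispatches `respond-to-ci` directly.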